Getting the Best from Uncertain Data
Abstract
The skyline of a relation is the set of tuples that are not dominated by any other tuple in the same relation, where tuple u dominates tuple v if u is no worse than v on all the attributes of interest and strictly better on at least one attribute. Previous attempts to extend skyline queries to probabilistic databases have proposed either a weaker form of domination, which cannot define the skyline unambiguously, or a definition that implies algorithms with exponential complexity. In this paper we demonstrate how, given a semantics for linearly ranking probabilistic tuples, the skyline of a probabilistic relation can be defined unambiguously. Our approach preserves the three fundamental properties of the skyline: 1) it equals the union of all top-1 results of monotone scoring functions, 2) it requires no additional parameter to be specified, and 3) it is insensitive to actual attribute scales. We also detail efficient sequential and index-based algorithms.
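The dominance and skyline definitions above can be illustrated with a minimal sketch for ordinary (non-probabilistic) tuples. The function names, the sample data, and the convention that smaller attribute values are better are assumptions made for illustration. The quadratic scan is a naive baseline, not one of the paper's sequential or index-based algorithms.

```python
from typing import List, Sequence, Tuple


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """Return True if u dominates v.

    Assumption: on every attribute, a smaller value is better.
    u dominates v when u is no worse on all attributes and
    strictly better on at least one.
    """
    no_worse = all(a <= b for a, b in zip(u, v))
    strictly_better = any(a < b for a, b in zip(u, v))
    return no_worse and strictly_better


def skyline(relation: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Return the tuples not dominated by any other tuple (naive O(n^2) scan)."""
    return [v for v in relation if not any(dominates(u, v) for u in relation)]


# Hypothetical data: (price, distance to the beach) for four hotels.
hotels = [(50, 8), (80, 2), (90, 5), (60, 9)]
print(skyline(hotels))  # [(50, 8), (80, 2)]
```

Here (90, 5) is dominated by (80, 2) and (60, 9) by (50, 8), so only the first two tuples remain. A tuple never dominates itself, because the strict-improvement condition fails, so the scan does not need to skip the tuple being tested.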
Related resources
Getting the Best from Uncertain Data: the Correlated Case
In this extended abstract we apply the notion of skyline to the case of probabilistic relations including correlation among tuples. In particular, we consider the relevant case of the x-relation model, consisting of a set of generation rules specifying the mutual exclusion of tuples. We show how our definitions apply to different ranking semantics and analyze the time complexity for the resolut...
The challenge of getting a high quality of RNA from oocyte for gene expression study
The extraction of intact RNA from oocytes is challenging and time-consuming. A standard protocol using a commercial RNA extraction kit yields a low quantity of RNA from oocytes. In the past, several attempts to obtain RNA for gene expression studies resulted in a few different modified methods. Extraction of high-quality RNA from oocytes is important before further downstream analyses such as...
Sensitivity Analysis of Spatial Sampling Designs for Optimal Prediction
In spatial statistics, the data analyzed are correlated, and this correlation is due to their locations in the studied region. Such correlation, which depends on the distance between observations, is called spatial correlation. Usually in spatial data analysis, the prediction of an uncertain quantity at arbitrary locations of the area is considered according to attained observations ...
A Bayesian mixture model for classification of certain and uncertain data
There are different types of methods for classifying certain data. However, the values of the variables are not always certain; they may belong to an interval, in which case the data are called uncertain. In recent years, under the assumption that the uncertain data are normally distributed, several estimators of the mean and variance of this distribution have been proposed. In this paper, we co...
Robust Economic-Statistical Design of Acceptance Control Chart
Acceptance control charts (ACC), as an effective tool for monitoring highly capable processes, establish control limits based on specification limits when the fluctuation of the process mean is permitted or inevitable. For designing these charts by minimizing economic costs subject to statistical constraints, an economic-statistical model is developed in this paper. However, the parameters of s...
Clustering of Uncertain Data Objects using Improved K-means Algorithm
Recently, data mining over uncertain data has attracted growing attention. Uncertainty arises in data because of inaccurate measurement, as in scientific results or data gathered from sensor networks measuring temperature, humidity, pressure, and so on. Such sources can introduce uncertainty into the data. The main task is to handle...